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Abstract 

We study Erdős-Rényi random graphs with random weights associated with each link. We
generate a new "supernode network" by merging all nodes connected by links having weights below
the percolation threshold (percolation clusters) into a single node. We show that this network is
scale-free, i.e., the degree distribution is $P(k) \sim k^{-\lambda}$ with $\lambda = 2.5$. Our results imply that
the minimum spanning tree (MST) in random graphs is composed of percolation clusters, which
are interconnected by a set of links that create a scale-free tree with $\lambda = 2.5$. We show that
optimization causes the percolation threshold to emerge spontaneously, thus naturally creating a
scale-free "supernode network." We discuss the possibility that this phenomenon is related to the
evolution of several real-world scale-free networks.

Scale-free topology is very common in natural and man-made networks. Examples range
from social contacts between humans to technological networks such as the World Wide
Web or the Internet [1, 2, 3]. Scale-free (SF) networks are characterized by a power-law
distribution of connectivities $P(k) \sim k^{-\lambda}$, where $k$ is the degree of a node and the exponent
$\lambda$ controls the broadness of the distribution. Many networks are observed to have values
of $\lambda$ around 2.5. For values of $\lambda < 3$ the second moment of the distribution $\langle k^2 \rangle$ diverges,
leading to several anomalous properties [4].

In many real-world networks there is a "cost" or a "weight" associated with each link,
and the larger the weight on a link, the harder it is to traverse this link. In this case, the
network is called "weighted" [5]. Examples can be found in communication and computer
networks, where the weights represent the bandwidth or delay time, in protein networks
where the weights can be defined by the strength of interaction between proteins [6, 7] or
their structural similarity [8], and in sociology where the weights can be chosen to represent
the strength of a relationship [9, 10].

In this Letter we introduce a simple process that generates random scale-free networks
with $\lambda = 2.5$ from weighted Erdős-Rényi graphs [11]. We further show that the minimum
spanning tree (MST) on an Erdős-Rényi graph is related to this network, and is composed
of percolation clusters, which we regard as "supernodes", interconnected by a scale-free
tree. We will see that due to optimization this scale-free tree is dominated by links having
high weights, significantly higher than the percolation threshold $p_c$. Hence, the MST
naturally distinguishes between links below and above the percolation threshold, leading to
a scale-free "supernode network". Our results may explain the origin of the scale-free degree
distribution in some real-world networks.

Consider an Erdős-Rényi (ER) graph with $N$ nodes and an average degree $\langle k \rangle$, thus having
a total of $N \langle k \rangle / 2$ links. To each link we assign a weight chosen randomly and uniformly
from the range [0, 1]. We define black links to be those links with weights below a threshold
$p_c = 1/\langle k \rangle$. Two nodes belong to the same cluster if they are connected by black links
[Fig. 1(a)]. From percolation theory [12] it follows that the number of clusters of $s$ nodes scales
as a power law, $n_s \sim s^{-\tau}$, with $\tau = 2.5$ for ER networks [13]. Next, we merge all nodes
inside each cluster into a single "supernode" [14]. We define a new "supernode network"
[Fig. 1(b)] of $N_{sn}$ supernodes [15]. The links between two supernodes [see Figs. 1(a) and
1(b)] have weights larger than $p_c$.

The degree distribution $P(k)$ of the supernode network can be obtained as follows. Every
node in a supernode has the same (finite) probability to be connected to a node outside the
supernode. Thus, we assume that the degree $k$ of each supernode is proportional to the
cluster size $s$, which obeys $n_s \sim s^{-2.5}$. Hence $P(k) \sim k^{-\lambda}$, with $\lambda = 2.5$, as supported by
simulations shown in Fig. 2.
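
The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the simulations: the function names (`weighted_er_graph`, `supernode_network`) are hypothetical, the graph is sampled with a fixed number of $N \langle k \rangle / 2$ distinct links, and clusters are found with a simple union-find structure.

```python
import random
from collections import Counter


def find(parent, x):
    """Return the root of x's cluster, compressing the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def weighted_er_graph(n, mean_degree, rng):
    """Sample n*mean_degree/2 distinct links, each with a uniform random weight."""
    n_links = int(n * mean_degree / 2)
    links = {}
    while len(links) < n_links:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            links[(min(u, v), max(u, v))] = rng.random()
    return [(w, u, v) for (u, v), w in links.items()]


def supernode_network(n, links, threshold):
    """Merge nodes joined by links lighter than threshold.

    Returns two Counters keyed by supernode label: cluster size and degree.
    """
    parent = list(range(n))
    for w, u, v in links:
        if w < threshold:  # "black" link: merge the two clusters
            parent[find(parent, u)] = find(parent, v)
    roots = [find(parent, x) for x in range(n)]
    sizes = Counter(roots)
    degrees = Counter({r: 0 for r in sizes})
    for w, u, v in links:
        if w >= threshold:  # link of the supernode network
            degrees[roots[u]] += 1
            degrees[roots[v]] += 1  # a self loop adds 2 to its supernode
    return sizes, degrees


# Example run (assumed parameters): N = 5000, <k> = 5, threshold p_c = 1/<k>.
rng = random.Random(0)
n, mean_degree = 5000, 5
links = weighted_er_graph(n, mean_degree, rng)
sizes, degrees = supernode_network(n, links, 1.0 / mean_degree)
```

A histogram of `degrees.values()` gives an estimate of $P(k)$ for one realization, and comparing `degrees[r]` with `sizes[r]` for each supernode `r` is one way to check the assumption that the degree is proportional to the cluster size.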

We next show that the minimum spanning tree (MST) on an ER graph is related to
the supernode network, and therefore also exhibits scale-free properties. The MST on a
weighted graph is a tree that reaches all nodes of the graph and for which the sum of the
weights of all the links (total weight) is minimal. Also, each path between two sites on the
MST is the optimal path in the "strong disorder" limit, meaning that along this
path the maximum barrier (weight) is the smallest possible.

Standard algorithms for finding the MST are Prim's algorithm, which resembles 
invasion percolation, and Kruskal's algorithm, which resembles percolation. An equivalent 
algorithm to find the MST is the "bombing algorithm". We start with the full ER
network and remove links in order of descending weights. If the removal of a link disconnects 
the graph, we restore the link and mark it "gray" [20]; otherwise the link [shown dotted in 
Fig. 1(a)] is removed. The algorithm ends and an MST is obtained when no more links can 
be removed without disconnecting the graph. 
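
The bombing procedure can be illustrated with the following Python sketch, which also compares its result with Kruskal's algorithm. The function names are hypothetical, the connectivity test is a plain breadth-first search (chosen for clarity, not efficiency), and the test graph is an assumed small connected graph (a ring plus random chords) rather than an ER graph, so that "disconnecting the graph" is well defined.

```python
import random
from collections import deque


def connected(adj, source, target):
    """Breadth-first search: is target reachable from source?"""
    seen, queue = {source}, deque([source])
    while queue:
        x = queue.popleft()
        if x == target:
            return True
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False


def bombing_mst(n, links):
    """Remove links in descending weight order unless removal disconnects the graph."""
    adj = [set() for _ in range(n)]
    for w, u, v in links:
        adj[u].add(v)
        adj[v].add(u)
    kept = []
    for w, u, v in sorted(links, reverse=True):
        adj[u].discard(v)
        adj[v].discard(u)
        if not connected(adj, u, v):  # removal would split the graph: restore
            adj[u].add(v)
            adj[v].add(u)
            kept.append((w, u, v))
    return kept


def kruskal_mst(n, links):
    """Reference MST: add links in ascending weight order unless they close a loop."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(links):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


# Assumed test graph: a ring (guarantees connectivity) plus random chords.
rng = random.Random(1)
n = 200
pairs = {tuple(sorted((i, (i + 1) % n))) for i in range(n)}
while len(pairs) < 3 * n:
    u, v = rng.randrange(n), rng.randrange(n)
    if u != v:
        pairs.add((min(u, v), max(u, v)))
links = [(rng.random(), u, v) for u, v in sorted(pairs)]
bombed = bombing_mst(n, links)
reference = kruskal_mst(n, links)
```

When all weights are distinct the MST is unique, so `sorted(bombed)` and `sorted(reference)` should coincide. In the terminology of the text, the kept links with weights above $p_c$ are the gray links.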

In the bombing algorithm, only links that close a loop can be removed. Because at
criticality loops are negligible [11, 13] for ER networks ($d \to \infty$), bombing does not modify
the percolation clusters, where the links have weights below $p_c$. Thus, bombing modifies
only links outside the clusters, so it is actually only the links of the supernode network that
are bombed. Hence the MST resulting from bombing is composed of percolation clusters
connected by gray links [Fig. 1(c)].

From the MST of Fig. 1(c) we now generate a new tree, the MST of the supernode
network, which we call the "gray tree", whose nodes are the supernodes and whose links are
the gray links connecting them [see Fig. 1(d)]. Note that bombing the original ER network to
obtain the MST of Fig. 1(c) is equivalent to bombing the supernode network of Fig. 1(b) to
obtain the gray tree, because the links inside the clusters are not bombed. We find [Fig. 3(a)]
that the gray tree also has a scale-free degree distribution $P(k)$, with $\lambda = 2.5$, the same as
the supernode network [21]. We also find [Fig. 3(b)] that the average path length $\ell_{\rm gray}$ scales as
$\ell_{\rm gray} \sim \log N_{sn} \sim \log N$ [15, 22]. Note that even though the gray tree is scale-free, it is not
ultra-small, since the length does not scale as $\log \log N$.
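
One possible way to extract the gray tree numerically is sketched below. As an assumption made for brevity, the sketch uses Kruskal's algorithm instead of bombing (the two give the same MST when all weights are distinct): links below the threshold are processed first and define the supernodes, and every heavier link that is subsequently accepted is a gray link. The name `gray_tree` and the parameter values are hypothetical.

```python
import random
from collections import Counter


def gray_tree(n, links, threshold):
    """Run Kruskal once; links accepted at or above threshold form the gray tree."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    ordered = sorted(links)
    split = 0
    # Phase 1: links below the threshold build the percolation clusters.
    while split < len(ordered) and ordered[split][0] < threshold:
        _, u, v = ordered[split]
        parent[find(u)] = find(v)
        split += 1
    supernode = [find(x) for x in range(n)]  # freeze the cluster labels
    # Phase 2: heavier links that join two different components are gray links.
    gray = []
    for w, u, v in ordered[split:]:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            gray.append((w, supernode[u], supernode[v]))
    return supernode, gray


# Example run (assumed parameters): N = 5000, <k> = 5, threshold p_c = 1/<k>.
rng = random.Random(2)
n, mean_degree = 5000, 5
sampled = {}
while len(sampled) < n * mean_degree // 2:
    u, v = rng.randrange(n), rng.randrange(n)
    if u != v:
        sampled[(min(u, v), max(u, v))] = rng.random()
links = [(w, u, v) for (u, v), w in sampled.items()]
supernode, gray = gray_tree(n, links, 1.0 / mean_degree)

# Degree of each supernode in the gray tree.
degree = Counter()
for w, a, b in gray:
    degree[a] += 1
    degree[b] += 1
```

If the sampled graph is not connected, the result is a forest of gray trees, one per connected component. The histogram of `degree.values()` can then be compared with the degree distribution of the full supernode network.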

Next we show that our optimization of the MST, which leads to the gray tree, yields a
significant separation between the weights of the links inside the supernodes and the links
connecting the supernodes. We consider each pair of nodes in the original MST of $N$ nodes
[Fig. 1(c)] and calculate the typical path length $\ell_{\rm typ}$, which is the most probable path length
on the MST. For each path of length $\ell_{\rm typ}$ we rank the weights on its links in descending
order. For the largest weights ("rank 1 links"), we calculate the average weight $w_{r=1}$ over
all paths. Similarly, for the next largest weights ("rank 2 links") we find the average $w_{r=2}$
over all paths, and so on up to $r = \ell_{\rm typ}$. The inset in Fig. 4 shows $w_r$ as a function of
rank $r$ for three different network sizes $N = 8000$, $16000$, and $32000$. In Fig. 4 we plot
the difference in consecutive average weights, $\Delta w_r = w_r - w_{r-1}$, as a function of $w_r$. We
see that weights below $p_c$ (black links inside the supernodes) are uniformly distributed and
approach one another as $N$ increases. In contrast, weights above $p_c$ ("gray links")
are not uniformly distributed, due to the bombing algorithm, and are independent of $N$.
The latter links, those with the highest weights, can be associated with gray links from very small
clusters [Figs. 1(a) and 1(c)]. These links can almost never be bombed, due to the limited number
of exits from small clusters, and therefore do not change with $N$. Moreover, because of the
abundance of small clusters ($n_s \sim s^{-\tau}$), large clusters are connected mostly to small clusters
(through links with relatively large weights).
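
The ranking procedure can be made concrete with a short Python sketch. The helper names (`path_weights`, `mean_weight_by_rank`) are hypothetical, the path length is passed in as a parameter rather than determined as the most probable length, and the five-node tree is an assumed toy example, not data from the simulations.

```python
from collections import deque
from itertools import combinations


def path_weights(adj, source, target):
    """Weights along the unique tree path between source and target."""
    parent = {source: (None, None)}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        if x == target:
            break
        for y, w in adj[x]:
            if y not in parent:
                parent[y] = (x, w)
                queue.append(y)
    weights = []
    node = target
    while parent[node][0] is not None:  # walk back from target to source
        node, w = parent[node]
        weights.append(w)
    return weights


def mean_weight_by_rank(adj, pairs, length):
    """Average the r-th largest weight over all listed paths with `length` links."""
    totals = [0.0] * length
    count = 0
    for s, t in pairs:
        ranked = sorted(path_weights(adj, s, t), reverse=True)
        if len(ranked) == length:
            count += 1
            for r, w in enumerate(ranked):
                totals[r] += w
    return [x / count for x in totals] if count else []


# Assumed toy tree: links given as (weight, u, v).
tree = [(0.1, 0, 1), (0.7, 1, 2), (0.3, 2, 3), (0.9, 1, 4)]
adj = {i: [] for i in range(5)}
for w, u, v in tree:
    adj[u].append((v, w))
    adj[v].append((u, w))
pairs = list(combinations(range(5), 2))
w_rank = mean_weight_by_rank(adj, pairs, 2)
```

Here `w_rank[0]` plays the role of $w_{r=1}$ and `w_rank[1]` that of $w_{r=2}$ for paths of two links. On a real MST, `length` would be set to the typical path length $\ell_{\rm typ}$.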

We thereby obtain a scale-free network with $\lambda = 2.5$, which is not very sensitive to the
precise value of the threshold used for defining the supernodes. For example, the scale-free
degree distribution shown in Fig. 3(a) for a threshold of $p_c + 0.01$ corresponds to having
only the four largest weights on the optimal paths (see Fig. 4). This means that mainly very
small clusters, connected with high-weight links to large clusters, dominate the scale-free
distribution $P(k)$ of the MST of the supernode network (gray tree). Hence, the optimization
process on an ER graph causes a significant separation between links below and above $p_c$
to emerge spontaneously in the system, and by merging nodes connected with links of low
weights, a scale-free network can arise.

The process described above may be related to the evolution of some real-world networks.
Consider a homogeneous network with many components whose average degree $\langle k \rangle$ is well
defined. Suppose that the links between the components have different weights, and that
some optimization process separates the network into nodes which are well connected (i.e.,
connected by links with low weights) and nodes connected by links having much higher
weights. If the well-connected components merge into a single node, this results in a new
heterogeneous supernode network with components that vary in size, and thus in number
of outgoing connections.

An example of a real-world network whose evolution may be related to this model is
the protein folding network, which was found to be scale-free with $\lambda \approx 2.3$ [7]. The nodes
are the possible physical configurations of the system and the links between them describe
the possible transitions between the different configurations. We assume that this network
is optimal because the system chooses the path with the smallest energy barrier from all
possible trajectories in phase space. It is possible that the scale-free distribution evolves
through a procedure similar to the one described above for random graphs: adjacent configurations
with close energies (nodes in the same cluster) cannot be distinguished and are regarded as
a single supernode, while configurations (clusters) with high barriers between them belong
to different supernodes.

A second example is computer networks. Strongly interacting computers (such as computers
belonging to the same university) are likely to converge into a single domain, and
thus domains with various sizes and connectivities are formed. This network might also be
optimal, because packets destined for an external domain are presumably routed through the
router which has the best connection to the target domain.

To summarize, we have seen that any weighted random network hides an inherent scale-free
"supernode network" [23]. We showed that the minimum spanning tree, generated by
the bombing algorithm, is composed of percolation clusters connected by a scale-free tree
of "gray" links. Most of the gray links connect small clusters to large ones, thus having
weights well above the percolation threshold that do not change with the original size of the
network. Thus the optimization in the process of building the MST distinguishes between
links with weights below and above the threshold, leading to a spontaneous emergence of a
scale-free "supernode network". We raise the possibility that in some real-world networks,
well-connected nodes merge into a single node, and through a natural optimization a
scale-free network emerges.

[Figure 1 panels: (a) Original network; (b) Supernode network; (c) MST of original network; (d) MST of supernode network]

FIG. 1: Sketch of the "supernode network". (a) The original ER network, partitioned into
percolation clusters whose sizes $s$ are power-law distributed, with $n_s \sim s^{-\tau}$ where $\tau = 2.5$ for ER
graphs. The "black" links are the links with weights below $p_c$, the "dotted" links are the links
that are removed by the bombing algorithm, and the "gray" links are the links whose removal would
disconnect the network (and therefore are not removed even though their weight is above $p_c$). (b)
The "supernode network": the nodes are the clusters in the original network and the links are
the links connecting nodes in different clusters (i.e., "dotted" and "gray" links). The supernode
network is scale-free with $P(k) \sim k^{-\lambda}$ and $\lambda = 2.5$. Notice the existence of self loops and of
double connections between the same two supernodes. (c) The minimum spanning tree (MST),
composed of black and gray links only. (d) The MST of the supernode network ("gray tree"),
which is obtained by bombing the supernode network (thereby removing the "dotted" links), or
equivalently, by merging the clusters in the MST to supernodes. The gray tree is scale-free, with
$\lambda = 2.5$.

FIG. 2: The degree distribution of the supernode network of Fig. 1(b), where the supernodes
are the percolation clusters, and the links are the links with weights larger than $p_c$ (○). The
distribution exhibits a scale-free tail with $\lambda \approx 2.5$. If we choose a threshold less than $p_c$, we obtain
the same power-law degree distribution with an exponential cutoff. The different symbols represent
slightly different threshold values: $p_c - 0.03$ (□) and $p_c - 0.05$ (△). The original ER network has
$N = 50{,}000$ and $\langle k \rangle = 5$. Note that for $k \approx \langle k \rangle$ the degree distribution has a maximum.

FIG. 3: (a) The degree distribution of the "gray tree" (the MST of the supernode network, shown
in Fig. 1(d)), in which the supernodes are percolation clusters and the links are the gray links.
Different symbols represent different threshold values: $p_c$ (○), $p_c + 0.01$ (□), and $p_c + 0.02$ (△).
The distribution exhibits a scale-free tail with $\lambda \approx 2.5$, and is relatively insensitive to changes in
$p_c$. (b) The average path length $\ell_{\rm gray}$ on the gray tree as a function of original network size. It
is seen that $\ell_{\rm gray} \sim \log N_{sn} \sim \log N$.

FIG. 4: The inset shows, for an ER graph with $\langle k \rangle = 5$, the average weights $w_r$ along the optimal
paths, sorted according to their rank. The main figure shows $\Delta w_r = w_{r+1} - w_r$, where $w_r$ is
the mean weight for rank $r$, vs. the weights along the optimal path. Different symbols represent
different system sizes: $N = 8000$ (○), $N = 16{,}000$ (□), and $N = 32{,}000$ (△). Below $p_c = 0.2$,
$\Delta w_r$ decreases for increasing $N$, while weights $w_r$ well above $p_c$ do not change with $N$.